Analytical Mean Squared Error Curves in Temporal Difference Learning
Authors
Peter Dayan, Brain and Cognitive Sciences, E25-210, MIT, Cambridge, MA 02139, [email protected]
Abstract
We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal difference value estimation algorithms change with offline updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show how TD is sensitive to the choice of its step-size and eligibility trace parameters.
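The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's analytical method: it estimates a mean squared error curve empirically by averaging over many simulated runs. The chain (a five-state symmetric random walk with reward 1 on exiting to the right), the uniform weighting of states in the error, the initial values of 0.5, and all function names are assumptions chosen for illustration. Updates are offline in the sense that the value table is held fixed within a trial and the accumulated increments are applied only at absorption.

```python
import random

def true_values(n_states):
    # Assumed chain: symmetric random walk on states 0..n-1, absorbing off
    # either end, reward 1 only when exiting to the right. With no
    # discounting, the true value of state s is (s + 1) / (n + 1).
    return [(s + 1) / (n_states + 1) for s in range(n_states)]

def run_trial(n_states, rng):
    # Simulate one trial from the middle state until absorption.
    # Each transition is (state, reward, next_state); next_state is None
    # when the chain is absorbed.
    s = n_states // 2
    transitions = []
    while True:
        s_next = s + rng.choice((-1, 1))
        if s_next < 0:
            transitions.append((s, 0.0, None))
            return transitions
        if s_next >= n_states:
            transitions.append((s, 1.0, None))
            return transitions
        transitions.append((s, 0.0, s_next))
        s = s_next

def offline_td_lambda(v, transitions, alpha, lam):
    # Offline TD(lambda) with a lookup table and accumulating traces:
    # v is not changed during the trial; increments are summed and
    # applied once at the end. The discount factor is assumed to be 1.
    trace = [0.0] * len(v)
    increment = [0.0] * len(v)
    for s, r, s_next in transitions:
        trace = [lam * e for e in trace]
        trace[s] += 1.0
        v_next = 0.0 if s_next is None else v[s_next]
        td_error = r + v_next - v[s]
        increment = [d + alpha * td_error * e for d, e in zip(increment, trace)]
    return [x + d for x, d in zip(v, increment)]

def mse(v, v_star):
    # Squared error averaged uniformly over states (an assumed weighting).
    return sum((a - b) ** 2 for a, b in zip(v, v_star)) / len(v)

def mse_curve(n_states=5, alpha=0.1, lam=0.5, n_trials=50, n_runs=200, seed=0):
    # Empirical MSE after 0, 1, ..., n_trials trials, averaged over runs.
    rng = random.Random(seed)
    v_star = true_values(n_states)
    totals = [0.0] * (n_trials + 1)
    for _ in range(n_runs):
        v = [0.5] * n_states
        totals[0] += mse(v, v_star)
        for t in range(1, n_trials + 1):
            v = offline_td_lambda(v, run_trial(n_states, rng), alpha, lam)
            totals[t] += mse(v, v_star)
    return [total / n_runs for total in totals]
```

Calling mse_curve with different alpha and lam values gives a rough empirical picture of the sensitivity the abstract mentions; the averaged curves are noisy estimates, which is the limitation that closed-form expressions avoid.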
Similar resources
Evaluation of remote sensing indicators in drought monitoring using machine learning algorithms (Case study: Marivan city)
Remote sensing indices are used to analyze the spatio-temporal distribution of drought conditions and to identify the severity of drought. This study uses various drought indices generated from MODIS and TRMM satellite data extracted from the Google Earth Engine (GEE) platform. Drought conditions in Marivan city from February to November for the years 2001 to 2017 were analyzed based on spatial a...
Using Machine Learning ARIMA to Predict the Price of Cryptocurrencies
The increasing volatility in pricing and growing potential for profit in digital currency have made predicting the price of cryptocurrency a very attractive research topic. Several studies have already been conducted using various machine-learning models to predict cryptocurrency prices. The study presented in this paper applied a classic Autoregressive Integrated Moving Average (ARIMA) model ...
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging
We consider d-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so-called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate θ* ∈ R^d (that is, an optimum or a fixed point) using noisy data and O(d) updates per iteration. In this paper, we ar...